Add modules utils under mix_utils folder to avoid duplication in JSON… #36

fabrizio-turchi · 2024-02-19T13:36:15Z

Add the utils.py module under mix_utils folder to:

avoid duplication in the JSON/CASE file generated by the parsers (UFED, AXIOM etc.)
amends the data to either make it homogeneous or get rid of some dirty characters extracted from the XML reports.

The module is shared among all parsers (UFED, AXIOM, OXYGEN and XAMN/XRY)

… file and uniform some date and JSON content

…ce to an uco-identity class

…et class

ajnelson-nist · 2024-02-20T13:43:36Z

Hi @fabrizio-turchi ,

If you are in progress on pushing changes, please leave this PR in Draft status until you are done and ready for it to be reviewed for merging.

Also, your first commit includes a compiled .pyc file. Can you please git rm that, and also include a patch updating .gitignore to ignore __pycache__ and *.pyc? I leave it up to you if you want to revise the history to purge that file.

fabrizio-turchi · 2024-02-20T13:58:40Z

Hi Alex, I'll remove the .pyc files and add them to the .gitignore, sorry it slipped my mind! I stop working on the repo for the moment, so no worry I'll wait for your review. Best regards. F. [cid:0a7159de-aa31-483f-a0a1-da5a8c203905] [facebook]<https://www.facebook.com/CNRsocialFB> [twitter] <https://twitter.com/CNRsocial_> [instagram] <https://www.instagram.com/cnrsocial/> [linkedin] <https://www.linkedin.com/company/283032> Fabrizio Turchi CNR, ISTITUTO DI INFORMATICA GIURIDICA E SISTEMI GIUDIZIARI [cid:69336b52-6f6a-4130-8edf-0cd2e5bc5fc6] Tel. +39 055 4399672 Cell. +39 328 36 23 632 ***@***.*** via de' Barucci, 20, 50121 – Firenze [cid:856ad76e-5a6b-4d2f-bf5a-0b6b2254710b] www.cnr.it<http://www.cnr.it/> Devolvi il 5×1000 al CNR CF 80054330586

…

________________________________ From: Alex Nelson ***@***.***> Sent: 20 February 2024 14:43 To: casework/CASE-Mapping-Python ***@***.***> Cc: Fabrizio Turchi ***@***.***>; Mention ***@***.***> Subject: Re: [casework/CASE-Mapping-Python] Add modules utils under mix_utils folder to avoid duplication in JSON… (PR #36) Hi @fabrizio-turchi<https://github.com/fabrizio-turchi> , If you are in progress on pushing changes, please leave this PR in Draft status until you are done and ready for it to be reviewed for merging. Also, your first commit includes a compiled .pyc file. Can you please git rm that, and also include a patch updating .gitignore to ignore __pycache__ and *.pyc? I leave it up to you if you want to revise the history to purge that file. — Reply to this email directly, view it on GitHub<#36 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AEDCCDIQTCGO3XUHSUB2TVLYUSSBJAVCNFSM6AAAAABDPPLSSCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNJUGI2DKNRTGM>. You are receiving this because you were mentioned.Message ID: ***@***.***>

ajnelson-nist

Two things:

.gitignore should prevent __pycache__ and *.pyc files from being committed. It might also be good to purge those files from the first patch, whether in a force-push on this PR or in a separate PR.
The adjust_json function looks like it could use further explanation because of in-memory data modification. I'd personally prefer a doctest be added and run as part of the module test suite if we confirm the behavior's necessary. (See these lines for a doctest demonstration.)

mix_utils/utils.py

… from mix_utils/utils.py module

ajnelson-nist

I think a pytest is needed. There might be runtime issues in some configurations of Python, e.g. Python 3.8.

mix_utils/utils.py

… pytest

fabrizio-turchi · 2024-02-23T09:29:38Z

@ajnelson-nist I turned the class into a function at model level. The test_duplicate.py contains all the ways I invoke the code in the UFED parser. The JSON objects taken into account in the test are a simplified version of the ones included in the parser code but it doesn't affect the result of the test. Feel free to make any kind of comment for the benefit of the final result.

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

ajnelson-nist · 2024-02-23T14:54:14Z

I realized a few things running pytest locally. The GitHub Action isn't running on this pull request; it seems we'd forgotten to turn on Actions for pull requests. I'm pushing that change to your main now.

Also, thank you again for using the typing symbols from Python <3.9. But, I'd forgotten that 3.8 is not supported by this package. The CASE validation action only operates in 3.9 or later, and the pyproject.toml file already specifies needing 3.9.

Thank you for the pytest. I was expecting it would operate on your new module as a package, but where you put the test made me realize your code isn't positioned to be a part of the cdo_mapping package. With where the code is, you and any end-users would only be able to use it by GIt submodule incorporation. The mix_utils directory would need to move to under /case_mapping/ to be exposed as part of the package.

The way to see this is to have the file test_duplicate.py move to under /tests, and then get its import line working again. My understanding of your intended use case is that your new module should be usable with an import like how /example.py has the line import case_mapping.

Last, be aware the node identifier style resolves to "@id": str(uuid.uuid4()), and without a @base statement in a graph-wide context dictionary, that attempted @id statement will not work as JSON-LD. (I forget if it will be silently dropped because the UUID is just a string, or if it will break in some other way, but case_validate would fail validation on the generated JSON because, at best, all of the nodes would be blank nodes.) I appreciate this test is designed to not think about the entire graph, but the @ids should at least read (after all the variables are resolved) "@id": "http://example.org/kb/" + str(uuid.uuid4()). kb: could also be fine as a prefix, as that's what's used almost everywhere across the CASE examples.

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

…lasses

fabrizio-turchi · 2024-02-26T13:42:04Z

@ajnelson-nist I tried to use the example module to test the FacetUrlHistory and UrlHistoryEntry classes, but it raises the error: "PackageNotFoundError: case-mapping". Inside the init.py of the package there's a reference to case-mapping but the folder is case_mapping. Could you have a look to it?

ajnelson-nist · 2024-02-26T14:28:58Z

@fabrizio-turchi : I have a guess why you're seeing that, but first, it would be helpful for review purposes if you would run make in your development environment. This will install pre-commit, by running this recipe. Then if you make any unimportant change to case_mapping/uco/observable.py---say, adding a new blank line at the end of the file---and attempt to commit the change, your code will be formatted as the Continuous Integration in this repository requires. You'll need to commit the changes pre-commit makes, but they should just be code formatting.

Back to your question - it sounds like the library is not installed in your testing environment. This is a reasonable expectation from tests being in separate directories from the package source.

If you run make, a virtual environment will be built (separate from the one pre-commit is installed into - the background on why is in this PR if you're curious), and the local case_mapping package will be installed into it. Activating that virtual environment will let you run pytest:

make
source venv/bin/activate  # Begin use of virtual environment
pytest
deactivate  # End use of virtual environment

make check in this project happens to include running pytest, in this recipe. (The commands pytest and poetry run pytest happen to do the same thing for me.)

If you follow the above steps, you'll find there was a missed str() cast in your last update, but you shouldn't have any trouble with packages.

fabrizio-turchi · 2024-02-27T14:30:29Z

@ajnelson-nist I wouldn't like to bother you about the error I run into yesterday.
I changed the example.py adding the FacetUrlHistory and UrlHistoryEntry Observables, but I had to change the line version = importlib.metadata.version("case-mapping") with version = "xxx". In this way I could run the example to test the changes and later I restored the line.
The error remains the same inside the package: importlib.metadata.PackageNotFoundError: case-mapping

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Reported-by: Keith Chason <kchason@users.noreply.github.com> Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

case_mapping/uco/observable.py

fabrizio-turchi · 2024-02-29T08:47:30Z

@ajnelson-nist I changed the code in the FacetUrlHistory class to fix the errors raised by case_validate. I didn't find a better way to fill in the uco-observable:urlHistoryEntry property. The code is borrowed by the _str_vars, _str_datetime ... methods of the FacetEntity class, but I couldn't use those methods. If you find a more elegant solution, do not hesitate to change teh code or suggest to me how to do it. 10x!

JSON-LD needs the value with the `@value` key to be a JSON string. Some undesired type-coercion could happen if ints or floats are used. Python type signatures are added to check data flow consistency. Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

If `Tuple[str, ...]` is used, then each individual positional argument must be a tuple of strings. See PEP-484. `Any` is used instead, to allow for `float`s in the accompanying unit test, and `dict`s in future applications. With these changes, `/mix_utils` passes `mypy`. References: * https://peps.python.org/pep-0484/#arbitrary-argument-lists-and-default-argument-values Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose. Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

With these changes, `/mix_utils` passes `mypy --strict`. Disclaimer: Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose. Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

The migrated test script demonstrates the module import pattern for consumers. Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

ajnelson-nist · 2024-03-12T15:51:43Z

@fabrizio-turchi : I've moved some files around so you can meet your original goal of getting a shareable, importable library. /mix_utils/utils.py has moved to case_mapping/mix_utils.py, because otherwise you would not be able to import utils.py as a module. The test you'd added under /mix_utils has moved to /tests, to demonstrate that once case_mapping is installed in a Python environment, mix_utils.py is accessible by import.

I've also added Python type reviews on all of the files that were under /mix_utils, because I caught one subtle JSON-LD issue (a necessary string casting) and I wasn't quite following some of the data-type flow without a signature being written.

If you look at the git log --stat messages, you'll see why I added those changes.

I'm approving this PR. If you agree with the changes, please feel free to merge, and then I'll start reviewing #38 .

Note: I did not review changes pertaining to other Issues. I only checked that the unit testing (case_validate, pytest, and mypy --strict on the mix-utils files) passes. I still think it is fine to merge this PR so 38 can be returned to, and if you think other Issues are resolved with the extra changes in this PR, please feel free to close them as addressed.

fabrizio-turchi added 4 commits February 19, 2024 14:32

Add modules utils under mix_utils folder to avoid duplication in JSON…

5466c19

… file and uniform some date and JSON content

Fix casework#22 issue, adding BrowserBookmarkFacet to observable module

6a14af9

Fix casework#3 issue, changing uco-tool:creator property as a referen…

f33a8f5

…ce to an uco-identity class

Fix casework#13 issue, replacing EventFacet class with EventRecordFac…

bb70ee7

…et class

ajnelson-nist requested changes Feb 20, 2024

View reviewed changes

mix_utils/utils.py Outdated Show resolved Hide resolved

fabrizio-turchi added 3 commits February 20, 2024 15:28

Change .gitignore to ignore the .pyc files nad the __pycache__ folder

fc0ac4d

Add type checking to mix_utils/util-py module

1e33b70

Remove AdjustDate class, useful only for UFED parser, and white space…

0a491b8

… from mix_utils/utils.py module

ajnelson-nist requested changes Feb 21, 2024

View reviewed changes

mix_utils/utils.py Outdated Show resolved Hide resolved

Change structure mix_utils/utils.py module. Add test_duplicate.py for…

ea81a41

… pytest

Enable CI on pull requests

1d97f40

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

ajnelson-nist and others added 2 commits February 23, 2024 07:57

Apply formatting

6ebfcc7

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Fix casework#12 issue, changing FacetUrlHistory and UrlHistoryEntry c…

169c32b

…lasses

Reformatted example.py

df8c25e

ajnelson-nist added 3 commits February 28, 2024 08:53

Apply formatting

e888d6f

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Deactivate validation report

05d7684

Reported-by: Keith Chason <kchason@users.noreply.github.com> Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Fix data typing

26a516f

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

ajnelson-nist reviewed Feb 28, 2024

View reviewed changes

case_mapping/uco/observable.py Outdated Show resolved Hide resolved

Reformatted line by Flake by CI action

2d48c7f

ajnelson-nist added 3 commits March 12, 2024 11:15

ajnelson-nist added 3 commits March 12, 2024 11:35

Link Issue

e72002e

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Regenerate Make-managed file

6aaa572

Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

Integrate mix_utils directory into package

491c4f8

The migrated test script demonstrates the module import pattern for consumers. Signed-off-by: Alex Nelson <alexander.nelson@nist.gov>

ajnelson-nist approved these changes Mar 12, 2024

View reviewed changes

fabrizio-turchi merged commit 4ad2ddb into casework:main Mar 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add modules utils under mix_utils folder to avoid duplication in JSON… #36

Add modules utils under mix_utils folder to avoid duplication in JSON… #36

Uh oh!

fabrizio-turchi commented Feb 19, 2024

Uh oh!

ajnelson-nist commented Feb 20, 2024

Uh oh!

fabrizio-turchi commented Feb 20, 2024 via email

Uh oh!

ajnelson-nist left a comment

Uh oh!

Uh oh!

ajnelson-nist left a comment

Uh oh!

Uh oh!

fabrizio-turchi commented Feb 23, 2024

Uh oh!

ajnelson-nist commented Feb 23, 2024

Uh oh!

fabrizio-turchi commented Feb 26, 2024

Uh oh!

ajnelson-nist commented Feb 26, 2024

Uh oh!

fabrizio-turchi commented Feb 27, 2024

Uh oh!

Uh oh!

fabrizio-turchi commented Feb 29, 2024

Uh oh!

ajnelson-nist commented Mar 12, 2024

Uh oh!

Uh oh!

Add modules utils under mix_utils folder to avoid duplication in JSON… #36

Add modules utils under mix_utils folder to avoid duplication in JSON… #36

Uh oh!

Conversation

fabrizio-turchi commented Feb 19, 2024

Uh oh!

ajnelson-nist commented Feb 20, 2024

Uh oh!

fabrizio-turchi commented Feb 20, 2024 via email

Uh oh!

ajnelson-nist left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ajnelson-nist left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

fabrizio-turchi commented Feb 23, 2024

Uh oh!

ajnelson-nist commented Feb 23, 2024

Uh oh!

fabrizio-turchi commented Feb 26, 2024

Uh oh!

ajnelson-nist commented Feb 26, 2024

Uh oh!

fabrizio-turchi commented Feb 27, 2024

Uh oh!

Uh oh!

fabrizio-turchi commented Feb 29, 2024

Uh oh!

ajnelson-nist commented Mar 12, 2024

Uh oh!

Uh oh!